Abstract. Probabilistic range query (PRQ) over uncertain moving objects has
recently attracted increasing attention, since it helps users extract the
moving objects of interest together with a quantitative probability. Previous
works usually assumed that objects move on a well-defined route, or that they
move freely in 2D space. The former assumption is feasible for query processing
on road networks, where roads are treated as a series of line segments; we call
this kind of query a "coarse-grained" query. However, such methods do not work
for query processing on a region, i.e., a "fine-grained" query. The latter
assumption suits query processing on a region, but it is impractical in real
applications, since various obstacles may limit the movement of the objects.
In this paper, we introduce the concept of restricted areas and focus on the
"fine-grained" query. Specifically, we consider the problem of PRQ over
uncertain moving objects in constrained 2D space. Solving this problem is
significant, since it is closer to real applications and contributes to
returning more authentic answers. We propose a solution and demonstrate the
effectiveness and efficiency of our approach through extensive experiments
under various experimental settings.



1 Introduction 

Recently, with the rapid development of positioning technologies such as GPS,
RFID and WSN, and the broad application of Location-Based Services [1] in
scenarios such as the digital battlefield, traffic control, mobile workforce
management and the transportation industry, the range query, as one of the
common operations in moving object databases, has attracted increasing
attention [2,3,4,5,6,7,8,9,10,11]. In general, mobile objects report their
locations to the corresponding server through a wireless interface, or the
objects are tracked through ground-based radars or satellites. However,
positioning technologies and other tracking means still cannot assure 100%
accuracy; that is, the acquired data may deviate from the real physical
locations [12]. Moreover, due to the limited network bandwidth and the limited
battery power of mobile devices, it is often infeasible for the database to
contain the total status of an entity (i.e., to monitor it at every moment in
time) [13], so we can only obtain discrete location information; the specific
position between two consecutive samplings is uncertain. To alleviate these
problems, the idea of incorporating uncertainty into moving object data has
been proposed [14]. A common model for characterizing the location uncertainty
of an object is a closed region together with a probability density function
(pdf) [15,13,14,16,17,18,19].

In the literature on moving object databases, there is already a large body
of work on PRQ over uncertain moving objects [19,20,12,21,22,16,23,24,13,17].
On the whole, previous works usually assumed that objects move on a
well-defined route [25,14], or that they move freely in 2-dimensional space
[16,19]. The former assumption is reasonable and feasible for the
"coarse-grained" query. For example, a frequently used method for query
processing on road networks is to use a graph to denote the road network,
where roads are treated as a series of line segments (from the macro point of
view). However, such methods do not work for query processing on a region,
i.e., the "fine-grained" query. The latter assumption suits the "fine-grained"
query and places fewer restrictions on how objects move, but it is impractical
in realistic applications, since various obstacles limit the movement of
objects. For example, an automobile usually cannot run in a lake or river. In
this paper, we introduce the concept of restricted area, and we are interested
in the "fine-grained" query. Introducing restricted areas is meaningful for two
reasons. First, it is closer to real applications. Second, we can obtain more
authentic answers by incorporating this additional information. In particular,
we observe that we may get incorrect answers if we ignore the constraints,
mainly because doing so ignores the change of the uncertainty region^1 and the
pdf.

As an example, see Figure 1: the small black dot Oj ($1 \le j \le 4$) denotes
a moving object, the biggest rectangle R denotes the query range, and the small
rectangle Bk ($1 \le k \le 4$) denotes a barrier. We assume that Oj cannot
reach the interior of these barriers, just as taxis cannot run in lakes (or
buildings, etc.). For ease of discussion, we assume that Oj follows a uniform
distribution in its uncertainty region, which we term URj. Then, the
probability of Oj being located in R equals the ratio of two areas, i.e., the
area of $UR_j \cap R$ to the area of URj.
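
The ratio can be written out explicitly. The following is only a restatement of the sentence above under the uniform-distribution assumption; $\mathrm{Area}(\cdot)$ and $f_j$ (the pdf of Oj) are notational assumptions introduced here for illustration.

```latex
P_j \;=\; \iint_{UR_j \cap R} f_j(x,y)\,dx\,dy
    \;=\; \frac{\mathrm{Area}(UR_j \cap R)}{\mathrm{Area}(UR_j)},
\qquad
f_j(x,y) = \frac{1}{\mathrm{Area}(UR_j)} \ \text{for } (x,y) \in UR_j .
```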

Given the query "retrieve the objects that are currently located in R with a
non-zero probability", Figure 1(a) depicts the case that ignores the
constraints, where the light grey circle (Oj.Q for short) illustrates the
uncertainty region. In this case, the query answer is {(O1, =100%),
(O3, >50%), (O4, <50%)}, where "<50%" denotes a specific decimal that is
larger than 0 and less than 0.5; the other symbols have similar meanings. In
contrast, Figure 1(b) presents the case that considers the constraints, where
Oj.Q cannot simply be taken as the uncertainty region of Oj; the real URj is
the region obtained by subtracting the barrier region from Oj.Q, e.g.,
UR3 = O3.Q - B3. In this case, the query answer is {(O1, =100%), (O3, <50%),
(O4, >50%)}. The two answers are different, and it is not difficult to see
that the first one is incorrect.

1 By uncertainty region, we mean the union of all possible positions where the
object may be located before the next sampling; it can be derived from the
last location and the distance threshold.




(a) (b) 
Fig. 1. Illustration of PRQ over uncertain moving objects 



Motivated by the above fact, in this paper we investigate the problem of PRQ
over uncertain moving objects in constrained 2D space. To the best of our
knowledge, this is the first effort to address this problem. We present a
solution for answering this kind of query; the proposed approach is simple and
easy to implement without loss of efficiency or effectiveness, and it can be
easily extended to real applications. Simply speaking, we use polygons to
represent restricted areas, approximate non-polygon entities (e.g., Oj.Q) by
polygons, and use the MBRs of the various entities to pick out useful
candidate entities. To improve the search efficiency, we use the classical
R-tree as the index, and discuss two R-tree based indexing schemes. We present
a label based data structure that is convenient for representing the
uncertainty region as well as other related entities (e.g., ISj), and that
also eases the follow-up calculation. By analysing the geometry relations
between different entities, we design algorithms for computing the uncertainty
region and the intersection (between the uncertainty region and the query
range). For obtaining the probability, we present two methods, a quick method
and a Monte Carlo method, for different types of pdf. Experimental results
demonstrate that our solution is efficient and effective. To summarize, we
make the following contributions.

— We refine the primitive uncertain moving object model by introducing the
concept of restricted areas, and re-formulate the traditional query based on
the refined model.

— By analysing the nature of our problem, we devise a framework and detailed
algorithms for achieving our goals.

— We demonstrate the performance of our proposed methods through extensive
experiments under different experimental settings.

The rest of the paper is organized as follows. In Section 2, we formalize the
problem. We introduce the preliminaries and analyse the nature of some vital
entities in Section 3. We present the query processing framework and detailed
algorithms in Section 4, and discuss the indexing schemes in Section 5. We
evaluate the performance of our proposed methods through extensive experiments
in Section 6. We review related works in Section 7. Finally, we conclude this
paper and present our future research directions in Section 8.



2 Problem Definition 

Consider a territory with M disjoint restricted areas (RAs) and N moving
objects (MOs), which move continuously and freely in the territory but cannot
enter the RAs. We assume that the location information of each MO is already
stored on the server. In addition, we assume that an MO reports its location
to the server when the deviation between its current location and its last
reported location is larger than a distance threshold (DT). Formally, we term
the territory T (a 2D space), the i-th restricted area RAi ($1 \le i \le M$),
and the j-th moving object Oj ($1 \le j \le N$). For Oj, we term the location
at an arbitrary instant of time t as $(L_j^t.X, L_j^t.Y)$, the last
sampling/reporting time as last, the current time as now, the location at the
last sampling/reporting time as $(L_j^{last}.X, L_j^{last}.Y)$, and the
distance threshold as DTj.



Since the location of an MO changes continuously, it is unreasonable to simply
use the last obtained location as the current location. Essentially, the
specific location at the current time is uncertain. One common model
[15,13,14,16,17,18,19] for capturing the location uncertainty of an MO
comprises two components:

Definition 1. The uncertainty region (UR) of a moving object Oj at an
instant of time t, denoted by $UR_j^t$, is a closed region where Oj can be
found.

Definition 2. The uncertainty probability density function of Oj at time
t, denoted by $f_j^t(x, y)$, is a probability density function (pdf) of the
location of Oj at the instant of time t; it has the value 0 outside $UR_j^t$.

Since $f_j^t(x, y)$ is a pdf, it has the property that
$\iint_{UR_j^t} f_j^t(x, y)\,dx\,dy = 1$. In general, the uncertainty region
$UR_j^t$ under the distance based update policy is a circle whose area is
$\pi \cdot (DT_j)^2$, where DTj is the distance threshold of Oj, and the centre
of $UR_j^t$ is the last sampling/reporting location
$(L_j^{last}.X, L_j^{last}.Y)$. For convenience, we use Oj.Q to denote this
region. The above representation is feasible if there is no barrier, but it
does not work once RAs exist. Therefore, a real uncertainty region for our
problem should be as follows.

[Equation (1), which defines $UR_j^t$ through the locations
$(L_j^t.X, L_j^t.Y)$ that Oj can take, is not recoverable from the source.]   (1)

Note that, in this case, for any two different instants of time t1 and t2
($t_1, t_2 \in (last, now]$), we have $UR_j^{t_1} = UR_j^{t_2}$ and
$f_j^{t_1}(x, y) = f_j^{t_2}(x, y)$. In view of this, unless stated otherwise,
we use URj and fj(x,y) to denote the uncertainty region and the pdf of Oj,
respectively.

Formally, we are given a territory T with M disjoint restricted areas, N
moving objects that move continuously on the non-restricted areas of T, and a
closed query range R.

Definition 3. A Probabilistic Range Query (PRQ) over uncertain moving
objects in the constrained 2D space returns a series of tuples of the form
(Oj, Pj), where Pj is the non-zero probability of Oj being located in the
closed region R.

Specifically, in this paper we focus on the snapshot query, and we always
assume that no two MOs are located at the same location at the same instant of
time. For convenience, we summarize the main symbols in Table 1.

Table 1. Main symbols used in this paper

Symbols     Description
R           query range
RAi         the i-th restricted area
Oj          the j-th moving object
URj         uncertainty region of the j-th moving object
DTj         distance threshold of the j-th moving object
fj(x,y)     pdf of the j-th moving object
EPj         the approximated equilateral polygon derived from Oj.Q
ISj         the intersection between R and URj
Pj          probability of Oj being located in query range R



3 Preliminaries 

3.1 Representation of the basic entities 

Since realistic application environments vary from place to place, the shapes
of RAs are diverse. Our objective is to build a general approach rather than
to focus on a specific environment. In this paper, we use polygons to
represent RAs. Oj.Q, which is a circle, is approximated by an equilateral
polygon (EP). Generally speaking, the more edges we use, the more accurate the
result. For clarity, we call the EP derived from Oj.Q EPj. Note that we let
Oj.Q be the circumscribed circle of EPj. By doing so, we can assure that the
distance from any point in EPj to the center never exceeds the distance
threshold DTj. We do this transformation for two reasons. First, it is
convenient for the follow-up calculation, since operating on line segments is,
in most cases, simpler and more efficient than operating on arcs. Second, it
makes the result of the calculation easy to represent. The method of
transforming Oj.Q into EPj is simple: we let
$X_k = L_j^{last}.X + DT_j \cdot \cos((k-1) \cdot 2\pi/EL)$ and
$Y_k = L_j^{last}.Y + DT_j \cdot \sin((k-1) \cdot 2\pi/EL)$, where EL is the
number of edges of EPj, $k \in \{1, 2, \dots, EL\}$, and $(X_k, Y_k)$ is the
k-th vertex of EPj.
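
The vertex formulas can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation; the function name `circle_to_polygon` and its parameter names are assumptions made for this example.

```python
import math

def circle_to_polygon(cx, cy, dt, el):
    """Vertices of the EL-edge polygon inscribed in the circle Oj.Q.

    (cx, cy) is the last reported location, dt the distance threshold DTj,
    el the number of edges EL. All names are hypothetical.
    """
    if el < 3:
        raise ValueError("a polygon needs at least 3 edges")
    step = 2.0 * math.pi / el
    # k runs over 1..EL, matching the (k - 1) factor in the formulas
    return [(cx + dt * math.cos((k - 1) * step),
             cy + dt * math.sin((k - 1) * step))
            for k in range(1, el + 1)]
```

Every vertex lies on the circle, so no point of the polygon is farther than DTj from the center; a larger `el` trades computation for a tighter approximation.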

3.2 Notations of geometry relations and operations 

Given two arbitrary simple polygons^2 PA and PB, there are in general five
types of geometry relations: $PA \cap PB$, $PA \subset PB$, $PA \supset PB$,
$PA = PB$ and $PA \nparallel PB$. Here $PA = PB$ means that the two polygons
coincide completely, and $PA \nparallel PB$ means that PA and PB are disjoint.
The subtraction operation and the intersection operation are two common
geometry operations, which are used extensively in this paper. For clarity, we
use "$PA - PB$" and "$PA \sqcap PB$" to denote them, respectively.

3.3 Candidate moving objects and candidate restricted areas 

For ease of discussion, in the rest of this paper we use BRAi to denote the
MBR of RAi, and use BRAi.left, BRAi.right, BRAi.top and BRAi.bottom to denote
the boundaries of BRAi. Unless stated otherwise, we treat other MBRs similarly
(e.g., we use BEPj to denote the MBR of EPj).

Definition 4. Given N moving objects, their distance thresholds, and a closed
region R, we use their MBRs to prune the unrelated moving objects; all moving
objects that are not pruned are the candidate moving objects (CMOs).

Definition 5. Given M restricted areas, a moving object Oj and its distance
threshold, we use their MBRs to prune the unrelated restricted areas; all
restricted areas that are not pruned are the candidate restricted areas
(CRAs) of Oj.
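
Definitions 4 and 5 both reduce to a rectangle-overlap test. The sketch below shows one way to write that test; the `MBR` tuple and the function names are assumptions for illustration, and it scans linearly rather than using an index.

```python
from typing import Dict, List, NamedTuple

class MBR(NamedTuple):
    """Axis-aligned bounding rectangle (field names are hypothetical)."""
    left: float
    right: float
    bottom: float
    top: float

def mbr_disjoint(a: MBR, b: MBR) -> bool:
    """True when the two rectangles share no point."""
    return (a.right < b.left or b.right < a.left or
            a.top < b.bottom or b.top < a.bottom)

def candidates(probe: MBR, boxes: Dict[str, MBR]) -> List[str]:
    """Keys whose MBR is not disjoint from the probe rectangle."""
    return [key for key, box in boxes.items() if not mbr_disjoint(probe, box)]
```

With `probe` set to the MBR of R and `boxes` holding the BEPj, the result plays the role of the CMOs; with `probe` set to BEPj and `boxes` holding the BRAi, it plays the role of the CRAs of Oj. Rectangles that merely touch are kept, which is the conservative choice for a pruning step.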

3.4 Understanding the UR 

Given a number of RAs, there are in general three cases for an MO Oj: (1) no
RA intersects EPj; (2) EPj minus all RAs is a single region; (3) EPj minus all
RAs consists of two or more subdivisions. As an example, in Figure 2 a grey
polygon denotes an RA, a black dot denotes Oj, and an equilateral polygon
denotes EPj ($j \in [1, \dots, 4]$). Obviously, O1 belongs to the first case,
and its UR is EP1. O2 and O3 belong to the second case: for O2 (or O3), the UR
is the result of subtracting all RAs from EP2 (or EP3). O4 belongs to the
third case: EP4 minus all RAs has two subdivisions, S1 and S2, and its UR is
S2. This is because one cannot reach any location in S1 from the center point
of EP4 without walking outside EP4. Therefore, S1 should be discarded.

Note that there is a hole in UR3. Here, we give the definitions of hole,
outer ring and inner ring.

2 By a simple polygon, we mean a polygon that is not self-intersecting and
has no hole(s) in it.






Fig. 2. Illustration of UR 



Definition 6. Given three closed regions R1, R2 and R3, in which
$R_1 \subset R_2$ and $R_3 = R_2 - R_1$, we call the periphery of R2 and that
of R1 the outer ring and the inner ring of the closed region R3,
respectively, and we call the region surrounded by the inner ring of R3 a
hole of R3.



3.5 Understanding the IS 

Let URj be a closed region with k ($k \ge 0$) holes, and let the query range R
be a closed region. The intersection (IS) between R and URj can be derived
from the relation between R and the outer ring (or inner ring(s)) of URj. For
ease of discussion, we term the IS between URj and R as ISj, the outer ring of
URj as OURj, and the i-th inner ring (or hole) of URj as $IUR_j^i$
($1 \le i \le k$). The geometry relation between OURj and R has 5 cases, as
shown in "Box Graph 1".



case 1.  $OUR_j \nparallel R$  ->  $IS_j = \emptyset$.
case 2.  $OUR_j \subset R$  ->  $IS_j = UR_j$.
case 3.  $OUR_j = R$  ->  $IS_j = UR_j$.
case 4.  $OUR_j \supset R$
    case 4.1.  $k = 0$  ->  $IS_j = R$.
    case 4.2.  $k \neq 0$  ->  $IS_j = R - \sum_{i=1}^{k} IUR_j^i$.
case 5.  $OUR_j \cap R$
    case 5.1.  $k = 0$  ->  $IS_j = OUR_j \sqcap R$.
    case 5.2.  $k \neq 0$  ->  $IS_j = (OUR_j \sqcap R) - \sum_{i=1}^{k} IUR_j^i$.

Box Graph 1




case 4.2.1.  $R = IUR_j^i$  ->  $IS_j = \emptyset$.
case 4.2.2.  $R \subset IUR_j^i$  ->  $IS_j = \emptyset$.
case 4.2.3.  $R \nparallel IUR_j^i$  ->  $IUR_j^i$ has no impact on $IS_j$.
case 4.2.4.  $R \supset IUR_j^i$  ->  $IUR_j^i$ will be a hole of $IS_j$.
case 4.2.5.  $R \cap IUR_j^i$  ->  $IUR_j^i$ may subdivide R.

Box Graph 2




case 5.2.1.  $OcR \nparallel IUR_j^i$  ->  $IUR_j^i$ has no impact on $IS_j$.
case 5.2.2.  $OcR \supset IUR_j^i$  ->  $IUR_j^i$ will be a hole of $IS_j$.
case 5.2.3.  $OcR \cap IUR_j^i$  ->  $IUR_j^i$ may subdivide OcR.

Box Graph 3

In the sequel, we mainly discuss case 4.2 and case 5.2, since the other cases
are straightforward. For case 4.2, the geometry relation between R and
$IUR_j^i$ also has 5 cases, as shown in "Box Graph 2".

Note that, for case 4.2.5, suppose R is subdivided by a certain hole of URj
into two subdivisions, say S1 and S2. Next, S1 (or S2) may be further
subdivided by another hole of URj into two subdivisions $S_{1,1}$ and
$S_{1,2}$ (or $S_{2,1}$ and $S_{2,2}$), and so on. For case 5.2, we should
consider the impact of $IUR_j^i$ on "$OUR_j \sqcap R$"; for convenience, we
abbreviate "$OUR_j \sqcap R$" as "OcR". There are 3 cases, as shown in
"Box Graph 3".

For case 5.2.3, which is similar to case 4.2.5, suppose OcR is subdivided by a
certain hole of URj into two subdivisions S1 and S2. Next, S1 (or S2) may be
further subdivided by another hole of URj.
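
The top-level case analysis for the outer ring can be read as a small dispatch on the relation between OURj and R and on the number of holes k. The sketch below only encodes that mapping; the enum `Rel`, the function name and the textual descriptions are assumptions for illustration, and the geometric test that produces the relation is left out.

```python
from enum import Enum

class Rel(Enum):
    """Relation of the outer ring OURj to the query range R (hypothetical names)."""
    DISJOINT = "disjoint"
    INSIDE = "inside"      # OURj lies inside R
    EQUAL = "equal"
    CONTAINS = "contains"  # OURj contains R
    OVERLAP = "overlap"    # the boundaries cross

def outer_ring_case(rel, k):
    """Return (case label, description of ISj) for a UR with k holes."""
    if k < 0:
        raise ValueError("k must be non-negative")
    if rel is Rel.DISJOINT:
        return ("1", "empty")
    if rel is Rel.INSIDE:
        return ("2", "URj")
    if rel is Rel.EQUAL:
        return ("3", "URj")
    if rel is Rel.CONTAINS:
        return ("4.1", "R") if k == 0 else ("4.2", "R minus the holes")
    return ("5.1", "OcR") if k == 0 else ("5.2", "OcR minus the holes")
```

Only the labels 4.2 and 5.2 require the per-hole analysis; every other label yields ISj directly.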

3.6 Label based data structure

We observe that the UR may be a closed region with hole(s) or just a simple
closed region. In addition, the IS may consist of multiple subdivisions with
hole(s). To operate on them in a unified manner, we present a label based data
structure (LBDS), which consists of three domains: one label domain and two
pointer domains.

— Flag: this domain tells us whether there are hole(s) in the entity.
Specifically, Flag equal to 0 means that there is no hole in it; otherwise,
there is at least one hole.

— OPointer: this domain points to a simple polygon, which denotes the outer
ring of the entity. A simple polygon consists of five domains.

• VPointer: this domain points to a linked list in which a series of vertices
is stored.

• left: the left boundary of this polygon.

• right: the right boundary of this polygon.

• top: the top boundary of this polygon.

• bottom: the bottom boundary of this polygon.

— IPointer: if Flag is not equal to 0, this domain points to a linked list in
which the simple polygon(s) denoting the inner ring(s)/hole(s) of this entity
are stored.

The UR can be represented directly by an LBDS, and the IS can be represented
by a linked list in which a series of LBDSs is stored.
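
One possible in-memory layout for this structure is sketched below with Python dataclasses. It is an illustration under stated assumptions, not the authors' code: Python lists stand in for the linked lists, the class and attribute names are hypothetical, and Flag is derived from the hole list instead of being stored separately.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float]

@dataclass
class SimplePolygon:
    vertices: List[Point]              # role of VPointer
    left: float = field(init=False)    # boundary values, computed once
    right: float = field(init=False)
    bottom: float = field(init=False)
    top: float = field(init=False)

    def __post_init__(self):
        xs = [x for x, _ in self.vertices]
        ys = [y for _, y in self.vertices]
        self.left, self.right = min(xs), max(xs)
        self.bottom, self.top = min(ys), max(ys)

@dataclass
class LBDS:
    outer: SimplePolygon                                      # role of OPointer
    inner: List[SimplePolygon] = field(default_factory=list)  # role of IPointer

    @property
    def flag(self) -> int:
        """0 when the entity has no hole, 1 otherwise."""
        return 1 if self.inner else 0
```

Under this layout a UR is a single `LBDS` value and an IS is a list of `LBDS` values, one per subdivision.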

4 Query Evaluation 
4.1 The Framework 

Figure 3 illustrates the pseudocode of the basic framework for answering PRQ
over uncertain moving objects in constrained 2D space. Specifically, there are
several steps. First, we search for the CMOs, which can be achieved by
comparing their MBRs (line 2). Second, for each CMO, we search for its CRAs
and compute its UR and IS (lines 4-6). Third, if the IS is equal to the UR (or
to $\emptyset$), we set the probability Pj to 1 (or 0); otherwise, we obtain
the probability by calculating the integral over the IS (lines 7-9). Fourth,
if Pj is not equal to 0, we store the identity of this CMO together with its
probability, and then move on to the next CMO until all CMOs have been
processed (line 10). Finally, we return the result, which includes all CMOs
that have a non-zero probability (line 11).

Procedure QueryFramework {

Input: R, $(L_j^{last}.X, L_j^{last}.Y)$, DTj, RAi, fj(x,y)
Output: ∪(Oj, Pj), where Pj > 0

(1) Answers ← ∅;
(2) CMOs ← the moving objects that may locate in R
(3) for each Oj ∈ CMOs
(4)     CRAs ← searching candidate restricted areas
(5)     URj ← computing the uncertainty region of Oj
(6)     ISj ← computing URj ⊓ R
(7)     if (ISj ≠ ∅)
(8)         if (ISj = URj) then Pj ← 1
(9)         else Pj ← ∫∫_{ISj} fj(x,y) dxdy
(10)        if (Pj ≠ 0) then Answers ← (Oj, Pj) ∪ Answers
(11) return Answers }

Fig. 3. PRQ over Uncertain Moving Objects in Constrained 2D Space

Note that there are two issues we should consider carefully in this
framework. (1) Evaluating the UR cannot be treated as a simple
"polygon-polygon subtraction"; an existing algorithm (e.g., [26]) is only the
basis of this work. (2) Computing the IS is also not a simple
"polygon-polygon intersection"; existing algorithms (e.g., [27,28,29,30])
support the "intersection operation" on polygons with holes, but they consider
neither the case where a large number of holes may exist nor the case where a
subdivision may be subdivided repeatedly. Moreover, the UR and the IS are
computed very frequently in our query processing. Therefore, it is necessary
to consider the nature of this problem and to devise targeted algorithms.
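
For intuition about the probability step of the framework, the integral over the IS can be approximated by sampling when the pdf is uniform. The sketch below is a generic Monte Carlo estimate and not the Monte Carlo method of this paper, whose details are not given in this part of the text; the function name, the membership predicates `in_ur` and `in_r`, and the bounding-box convention are all assumptions.

```python
import random

def estimate_probability(in_ur, in_r, box, n=20000, seed=0):
    """Estimate Pj = Area(UR ∩ R) / Area(UR) for a uniform pdf.

    in_ur(x, y) and in_r(x, y) are membership tests for the uncertainty
    region and the query range; box = (left, right, bottom, top) must
    enclose the uncertainty region.
    """
    left, right, bottom, top = box
    rng = random.Random(seed)          # fixed seed keeps runs repeatable
    hits_ur = hits_is = 0
    for _ in range(n):
        x = rng.uniform(left, right)
        y = rng.uniform(bottom, top)
        if in_ur(x, y):                # the sample falls inside UR
            hits_ur += 1
            if in_r(x, y):             # ... and also inside R, i.e. inside IS
                hits_is += 1
    return hits_is / hits_ur if hits_ur else 0.0
```

Samples are drawn in the bounding box and rejected when they fall outside the UR, so holes and discarded subdivisions are handled by the `in_ur` predicate alone. A non-uniform pdf would require weighting each accepted sample by its density.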

4.2 Computing the UR 

In this subsection, we present our approach for computing the UR. We first
describe the tactics and then present the detailed procedure. On the whole, we
use four tactics. (1) We deal with the CRAs that may split the entity (e.g.,
EPj) into multiple subdivisions as early as possible. (2) Once multiple
subdivisions appear, we immediately update the boundary, i.e., the MBR;
otherwise, we do not update the boundary until we have traversed all the CRAs.
(3) Once we confirm that the entity has been subdivided, we use the new
boundary to prune the remaining CRAs. (4) We process the CRAs that result in
holes in the current entity as late as possible.

[Figure 4, panels (a) and (b): EPj and its candidate restricted areas CRA1 to CRA7]

The span of CRA. Suppose Oj has k CRAs (CRA1, ..., CRAk). To obtain URj, a
direct method is to evaluate the subtraction between EPj and all the CRAs one
by one. However, this is a brute-force solution. To illustrate the reason, we
first give the definitions of vertical span, horizontal span and span.

Definition 7. Given RAi and BRAi, the horizontal span of RAi is
|BRAi.right - BRAi.left|, RAi.HS for short. Similarly, the vertical span of
RAi is |BRAi.top - BRAi.bottom|, RAi.VS for short. The span of RAi is
MAX{RAi.HS, RAi.VS}.

For CRAm and CRAn ($m, n \in [1, \dots, k]$, $m \neq n$), the spans may be
different. In particular, we observe that a CRA with a large span is more
likely to subdivide a single region into two (or more) subdivisions than a CRA
with a small span. This observation contributes to the evaluation of URj.
Here, we sort all the CRAs of Oj by span in descending order before evaluating
the subtraction between EPj and the CRAs. As an example, see Figure 4: the big
equilateral polygon denotes EPj, a grey polygon denotes a candidate restricted
area CRAm ($m \in [1, \dots, 7]$), and the spans decrease from CRA1 to CRA7.
Therefore, based on our tactics, we should deal with CRA1 first, then CRA2,
and so on. Note that, after we deal with CRA1, there are two subdivisions, S1
and S2, as shown in Figure 4(b). Based on the analysis in Subsection 3.4, we
choose S1 as the real subdivision. Thus, when we deal with CRA2, we can
quickly determine that it is irrelevant to the final result. Otherwise, if we
dealt with CRA2 before CRA1, we would have to compute "EPj - CRA2", which
incurs extra cost.
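
The span of Definition 7 and the descending sort can be sketched as follows. The function names and the `(left, right, bottom, top)` tuple layout for an MBR are assumptions made for this example.

```python
def span(box):
    """Span of a restricted area given its MBR as (left, right, bottom, top)."""
    left, right, bottom, top = box
    horizontal = abs(right - left)   # RAi.HS
    vertical = abs(top - bottom)     # RAi.VS
    return max(horizontal, vertical)

def sort_by_span(cras):
    """Names of the CRAs ordered by span, largest first.

    cras maps a (hypothetical) CRA name to its MBR tuple.
    """
    return sorted(cras, key=lambda name: span(cras[name]), reverse=True)
```

Because Python's sort is stable, CRAs with equal spans keep their original relative order.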

Tactic for handling holes. Suppose we have sorted all the CRAs of Oj. Next, we
evaluate the subtraction between EPj and the CRAs one by one. For clarity, we
call the result of "EPj - CRA1" EPj.1, the result of "EPj.1 - CRA2" EPj.2, and
so on. Note that we use EPj and "EPj.0" interchangeably. In some cases,
$CRA_m \subset EP_j.(m-1)$ ($m \in [1, \dots, k]$). Obviously, CRAm then
results in a hole in the current EPj.(m-1). A straightforward method is to
compute "EPj.(m-1) - CRAm", update EPj.(m-1) right away, and then deal with
the next CRA using the new EPj.m. However, we observe that doing so
complicates the follow-up calculation. Here, we postpone the subtraction
operation if CRAm results in a hole in the current EPj.(m-1). Specifically, we
store CRAm in another place and deal with it after we have traversed all the
remaining CRAs of Oj. Taking CRA3 in Figure 4(b) as an example, it results in
a hole in EPj.2, so we store it in another place (e.g., a linked list
'uHoles') and move on to CRA4, and so on. After we have traversed all the
remaining CRAs of Oj, we fetch CRA3 from the linked list 'uHoles' and
determine whether it results in a hole in EPj.7. Here it does, so we make it
an inner ring of EPj.7.

Updating regulations. As stated above, EPj evolves continuously in the process
of computing the UR. If $CRA_m \cap EP_j.(m-1)$, then EPj.(m-1) differs from
"EPj.(m-1) - CRAm". In this case, we update by assigning the result of
"EPj.(m-1) - CRAm" to EPj.m. Correspondingly, BEPj.m, i.e., the MBR of EPj.m,
may change. In principle, once this happens, we should compute the new MBR and
assign it to BEPj.m. However, we do not update the MBR right away, since we
observe that, in most cases, the new MBR contributes little to the rest of the
computation. In addition, computing the new MBR requires traversing the
vertices of EPj.m, which takes some time. Here, we update the MBR in a lazy
manner. Specifically, we update the MBR only in two cases: either CRAm
subdivides EPj.(m-1) into at least 2 subdivisions, or all the CRAs of Oj have
been traversed. Taking CRA1 in Figure 4 as an example, after we deal with it,
two subdivisions appear, and at this time we should update EPj and BEPj.
Specifically, we assign S1 to EPj.1 and assign the MBR of S1 to BEPj.1. For
CRA2 and CRA3, "$CRA_2 \nparallel EP_j.1$" and "$CRA_3 \subset EP_j.2$"; thus,
we let EPj.2 = EPj.1 and EPj.3 = EPj.2 according to our tactics. That is,
there is no update when we deal with CRA2 and CRA3. For CRA4, we let
EPj.4 = EPj.3 - CRA4, that is, we update EPj.3. However, we do not update
BEPj.3, whatever the real MBR of "EPj.3 - CRA4" is, since "EPj.3 - CRA4" does
not result in multiple subdivisions. So we let BEPj.4 = BEPj.3.

The new boundary. As discussed previously, we update the boundary when a CRA
subdivides the current entity, and the new boundary helps prune the remaining
unrelated CRAs. Here, we use a 'switch' to remember whether the entity has
ever been subdivided. Once the 'switch' is set to 1, in the rest of the
computation we always compare the MBRs before evaluating the subtraction. For
example, the dashed rectangle in Figure 4(b) is the new boundary after we deal
with CRA1; thus, for the remaining CRAs (from CRA2 to CRA7), we always use the
new boundary to determine whether a CRA can be pruned directly.
Here, both CRA2 and CRAq can be pruned directly. Note that, if the 'switch'
is off (i.e., 0), we never compare the MBRs, since doing so incurs cost
without any benefit for the rest of the computation.

The procedure for computing UR. Figure 5 depicts the pseudocode for computing
the UR, which consists of two procedures: HandleCRA( ) and EvaluateUR( ). Note
that some symbols (e.g., EPj) used in the two procedures should be understood
as in our previous discussion, since these entities evolve continuously.

Procedure EvaluateUR ( ) {

Input: $(L_j^{last}.X, L_j^{last}.Y)$, DTj, $\sum_{i=1}^{M} RA_i$
Output: URj

(1) URj ← ∅; EPj ← Transform Oj.Q; BEPj ← MBR of EPj
(2) CRAs ← search candidate restricted areas based on BEPj and BRAi
(3) if (CRAs ≠ ∅)
(4)     sorting CRAs; switch ← 0; uHoles ← ∅; rHoles ← ∅
(5)     for each candidate restricted area CRAm ∈ CRAs
(6)         if (switch = 0) then Procedure HandleCRA (uHoles, CRAm, switch, EPj)
(7)         else // switch = 1
(8)             if (¬(BCRAm ∦ BEPj)) then
(9)                 Procedure HandleCRA (uHoles, CRAm, switch, EPj)
(10)    if (uHoles ≠ ∅) then
(11)        for each CRAm ∈ uHoles
(12)            if ((vertex of CRAm) ∈ EPj) then rHoles ← rHoles ∪ CRAm
(13)    if (rHoles ≠ ∅) then Let all CRAs (∈ rHoles) as the inner ring of EPj
(14) URj ← EPj;
(15) return URj }

Procedure HandleCRA ( ) {

Input: EPj, CRAm, uHoles, switch
(16) if (EPj ∩ CRAm)
(17)     EPj ← EPj - CRAm; SN ← number of subdivisions in EPj
(18)     if (SN > 1) then
(19)         EPj ← choose real subdivision from multiple subdivisions; update BEPj
(20)         if (switch = 0) then switch ← 1
(21) else // (EPj ⊃ CRAm) or (EPj ∦ CRAm)
(22)     if (EPj ⊃ CRAm) then uHoles ← uHoles ∪ CRAm }

Fig. 5. Pseudocode for Computing UR

Procedure EvaluateUR( ). First we get the CRAs of Oj based on the MBRs (lines
1-2). If $CRAs \neq \emptyset$, we sort the CRAs and initialize the variables
(line 4), where switch indicates whether EPj has ever been subdivided, uHoles
stores the CRAs that result in holes in the current EPj, and rHoles stores the
CRAs ($\in$ uHoles) that result in holes in the last EPj. Since we adopt the
lazy updating manner, once BEPj has been updated we use the new MBR to
determine whether a CRA can be pruned before dealing with it; otherwise, we
deal with it directly using the procedure HandleCRA( ) (lines 6-9). Once we
have traversed all CRAs, we determine whether each CRA ($\in$ uHoles) results
in a hole in the last EPj, which can be achieved by determining whether a
vertex of the CRA is located in the last EPj. Here, we use the algorithm
proposed by Hormann et al. [31] to solve the point-in-polygon problem. All the
CRAs ($\in$ rHoles) become inner rings of the last EPj. In the end, we assign
the last EPj to URj (lines 10-15).
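
The vertex test used to confirm a postponed hole can be illustrated with the classic even-odd ray-casting rule. This is a swapped-in textbook technique shown only for intuition; it is not claimed to be the algorithm of [31], and the function names are assumptions.

```python
def point_in_polygon(px, py, vertices):
    """Even-odd test: True when (px, py) lies strictly inside the polygon.

    vertices is a list of (x, y) pairs in boundary order. Points exactly on
    the boundary are not handled specially in this sketch.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does the horizontal ray from (px, py) to +infinity cross this edge?
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside

def has_vertex_inside(cra_vertices, ep_vertices):
    """True when at least one vertex of the CRA lies inside the polygon EP."""
    return any(point_in_polygon(x, y, ep_vertices) for x, y in cra_vertices)
```

In the procedure above, a CRA kept in uHoles would be moved to rHoles when `has_vertex_inside` holds for the final EPj.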

Procedure HandleCRA( ). First we determine the geometry relation between the
current EPj and the CRA. There are 3 cases. For $EP_j \cap CRA$, we use the
algorithm proposed by Margalit et al. [26] to do the subtraction operation.
For $EP_j \supset CRA$, we add the CRA to the uHoles linked list, and we do
nothing when $EP_j \nparallel CRA$.

4.3 Computing the IS

In Section 3, we analysed the IS according to the geometry-topological
relation between R and the outer ring (or inner rings) of the UR, which is the
basis for developing the targeted algorithm for computing the IS.




Multi-levels screening To get ISj, we use the MBRs to do an elementary
screening before executing geometric operations on two entities. The first
level screening is for OURj and R; we call their MBRs BOURj and BR,
respectively. Obviously, if BOURj ∥ BR (the two MBRs are disjoint), ISj must be
∅. As an example, the biggest rectangle in Figure 6 illustrates the query range
R. It has 5 CMOs, since BR is not disjoint from BEPj (j ∈ [1, ..., 5]); the
grey polygons illustrate their URs. In order to evaluate ISj, we first execute
the first level screening. Here BR ∥ BOURj (1 ≤ j ≤ 4), so we know immediately
that ISj = ∅ for these objects. Note that UR5 cannot be pruned by the first
level screening.

The second level screening helps us find the candidate holes (cHoles for short)
among the holes $IUR_j^i$ of URj. There are two cases. (1) We use the MBR of R
to prune unrelated holes. As an example, the black solid-line rectangle in
Figure 7(a) illustrates the query range R, and the grey region is UR1, which
has 7 holes (the white polygons), say $IUR_1^i$ (i ∈ [1, ..., 7]). Here
R ⊂ OUR1, therefore $IS_1 = R - \sum_{i=1}^{7} IUR_1^i$ (recall case 4.2 in
Subsection 3.4). Two of the holes can be pruned by the second level screening;
the remaining holes are the candidate holes. (2) We use the MBR of OcR
(= OURj ∩ R) to prune unrelated holes. Figure 7(b) presents this case. Here R
intersects OUR1, therefore $IS_1 = (R \cap OUR_1) - \sum_{i=1}^{7} IUR_1^i$
(recall case 5.2 in Subsection 3.4). Similar to the previous case, two holes
can be pruned by the second level screening.

The third level screening uses the MBR of a candidate hole to prune unrelated
subdivisions. There are also two cases. (1) The subdivisions are derived from
R. As an example, suppose we deal with the candidate holes in Figure 7(a) from
left to right. After we deal with the first candidate hole, R minus that hole
has two subdivisions, S1 and S2, as shown in Figure 7(c). Next, when we deal
with the second candidate hole, we use its MBR to prune unrelated subdivisions.
Here BS1 is disjoint from the MBR of this hole, so S1 is pruned. Similarly,
after we deal with this hole, S2 is subdivided into S2.1 and S2.2, as shown in
Figure 7(d). So when we deal with one of the following candidate holes, S1 and
S2.1 can be pruned. (2) The subdivisions are derived from OcR. For example,
after we deal with a candidate hole in Figure 7(b), OcR is subdivided into two
subdivisions; for the rest of the candidate holes, we always use the MBR of the
candidate hole to prune unrelated subdivisions.
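
The MBR tests behind the second and third level screening can be sketched as
below. The tuple layout (xmin, ymin, xmax, ymax), the function names and the
choice to treat touching MBRs as not disjoint are assumptions made for this
illustration; they are not taken from our implementation.

```python
def mbr(ring):
    """Minimum bounding rectangle of a polygon ring as (xmin, ymin, xmax, ymax)."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    return (min(xs), min(ys), max(xs), max(ys))


def disjoint(a, b):
    """True if two MBRs share no point (touching MBRs count as not disjoint)."""
    return a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1]


def candidate_holes(holes, region_ring):
    """Second level: keep holes whose MBR is not disjoint from the region's MBR."""
    region_box = mbr(region_ring)
    return [h for h in holes if not disjoint(mbr(h), region_box)]


def related_subdivisions(hole, subdivisions):
    """Third level: keep subdivisions whose MBR is not disjoint from the hole's MBR."""
    hole_box = mbr(hole)
    return [s for s in subdivisions if not disjoint(mbr(s), hole_box)]
```

Only the entities that survive these cheap rectangle tests need to be passed
to the more expensive polygon operations.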




Fig. 7. Illustration of Computing IS 



Sorting candidate holes in ascending order Similar to a CRA, a candidate
hole with a large span is more likely to subdivide a single region into two (or
more) subdivisions than a hole with a small span. In Subsection 4.2 we deal
with CRAs with large spans as early as possible. Here, on the contrary, we deal
with candidate holes with large spans as late as possible, so we sort them in
ascending rather than descending order before executing geometric operations on
them. The reason is that the newly produced subdivisions cannot be discarded
and must be used when we deal with the rest of the candidate holes. If we dealt
with a candidate hole with a large span first, many subdivisions might be
produced before we deal with the candidate holes with small spans, which would
incur more MBR comparisons. As an example, let us revisit Figure 7(b). There
are 5 candidate holes, say $IUR_1^i$ (i ∈ [3, ..., 7]). According to our
tactic, we deal with the hole with the smallest span first, then the hole with
the next smallest span, and so on. Figure 7(e) illustrates the situation in
which the holes with small spans have been dealt with and there are two
subdivisions, S1 and S2. When we deal with the next hole, the third level
screening (described above) is activated since there are multiple subdivisions.
Note that here we execute only two comparisons, i.e., we compare the MBR of
this hole with the MBR of S1 and with the MBR of S2. In contrast, in Figures
7(c) and 7(d), where we deal with the candidate holes from left to right, 10
comparisons are needed.



Other tactics Similar to the discussion in Subsection 4.2, once a single region
is subdivided, we update the boundary, but there is a small difference: we use
the newly produced "multiple" subdivisions to substitute the old "single"
region, and update the boundary of "each" new subdivision. Otherwise, we
postpone updating the boundary. In addition, the tactic for dealing with
entities that may result in a hole is the same as that in Subsection 4.2.



Procedure EvaluateIS ( ) {
Input: URj, R
Output: ISj

(1) ISj ← ∅;
(2) if (¬(BOURj ∥ BR)) then
(3)   if ((URj.Flag = 1) ∧ ((OURj ⊃ R) ∨ (OURj ∩ R))) then
(4)     if (OURj ⊃ R) then // case 4.2
(5)       tempIS ← R;
(6)     else // OURj ∩ R, case 5.2
(7)       tempIS ← OcR; // OcR = OURj ∩ R
(8)     Procedure HandleComplex (tempIS, URj);
(9)     ISj ← tempIS;
(10)  else // other cases are straightforward
(11)    ISj ← "certain value"; // please refer to Subsection 3.5
(12) return ISj }

Procedure HandleComplex ( ) {
Input: tempIS, URj

(13) cHoles ← ∅; rHoles ← ∅; sign ← 0
(14) for each holei ∈ URj
(15)   if (¬(Bholei ∥ BtempIS)) then cHoles ← cHoles ∪ holei
(16) sort cHoles based on span;
(17) for each holei ∈ cHoles
(18)   if (sign = 0) then
(19)     if (tempIS = holei ∨ tempIS ⊂ holei) then tempIS ← ∅; return;
(20)     else if (tempIS ⊃ holei) then rHoles ← rHoles ∪ holei;
(21)     else if (tempIS ∥ holei) then // do nothing
(22)     else // (tempIS ∩ holei)
(23)       tempIS ← tempIS − holei; NS ← number of subdivisions in tempIS;
(24)       if (NS > 1) then sign ← 1
(25)   else // (sign = 1)
(26)     temp ← ∅
(27)     for each subdivision Si (∈ tempIS)
(28)       if (¬(Bholei ∥ BSi)) then
(29)         if (Si ∩ holei) then
(30)           temp ← temp ∪ (Si − holei); delete Si from tempIS
(31)         else if (Si ⊃ holei) then rHoles ← rHoles ∪ holei; break;
(32)         else // (Si ∥ holei), do nothing
(33)     tempIS ← tempIS ∪ temp
(34) if (rHoles ≠ ∅) then
(35)   for each holei ∈ rHoles
(36)     for each Si ∈ tempIS
(37)       if ((vertex of holei) ∈ Si) then
(38)         let holei be an inner ring of Si; break; }



Fig. 8. Pseudocode of Computing IS 

Procedure for Computing IS Figure 8 depicts the pseudocode for computing the
IS, which consists of 2 procedures.

Procedure EvaluateIS( ) First, we use the MBRs of R and OURj to check whether
they are disjoint, i.e., the first level screening. If they are not, we process
them differently according to their geometric relation and URj.Flag. Note that
we list only case 4.2 and case 5.2 to save space; the other cases are simple
and can be handled in a straightforward manner (refer to Subsection 3.5).

Procedure HandleComplex( ) This procedure deals with case 4.2 and case 5.2.
First, we obtain the candidate holes, i.e., the second level screening, and
sort them based on their spans (lines 13-16). Next, we process each candidate
hole (lines 17-33). If the entity tempIS has never been subdivided by a
candidate hole, we execute lines 18-24. Otherwise, we execute lines 25-33. Here
we use the MBR of holei and the MBR of Si to prune unrelated subdivisions (line
28), i.e., the third level screening.

Note that we add the result of "Si − holei" to temp and delete Si from tempIS
(line 30), where temp is a linked list that stores the modified Si (i.e.,
"Si − holei"). This is because the current hole has no further impact on the
modified (or new) Si. Therefore, we store the modified Si in temp for the time
being, and we merge temp into tempIS only after we have traversed all old Si
(line 33). In addition, when Si ⊃ holei, we postpone dealing with this hole by
storing holei in rHoles (line 31), where rHoles is a linked list used for
storing each holei (∈ cHoles) that must be a hole in ISj. At last, if rHoles is
not empty, we put all the holes (∈ rHoles) into their corresponding
subdivisions (lines 34-38).

4.4 Evaluation of Probability 

In this subsection, we present two methods for evaluating the probability: one
for a uniform pdf, the other for an arbitrary pdf.



Quick Method Here we present the quick method, which is simple and efficient
for a uniform pdf. As Pfoser et al. [12] pointed out, the uniform distribution
corresponds to the "worst-case" scenario. In this case, fj(x, y) = 1/AURj,
where AURj is the area of URj. For clarity, we denote the area of ISj by AISj;
then we have Pj = AISj/AURj. Therefore, the crucial task is to evaluate AURj
and AISj. Our data representation, LBDS (recall Subsection 3.6), has a great
advantage for evaluating these areas: since we use the polygon as the basic
element, the area of a polygon can be derived from the following formula.

$$ S = \frac{1}{2} \left( \begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} + \begin{vmatrix} x_2 & x_3 \\ y_2 & y_3 \end{vmatrix} + \cdots + \begin{vmatrix} x_n & x_1 \\ y_n & y_1 \end{vmatrix} \right) \tag{2} $$

where $\begin{vmatrix} x_1 & x_2 \\ y_1 & y_2 \end{vmatrix} = (x_1 y_2 - x_2 y_1)$,
and $(x_1, y_1)$ denotes a vertex; the other symbols have similar meanings.
Therefore, both AURj and AISj can be easily obtained. Specifically, we can get
AURj from the following formula.

$$ AUR_j = AOUR_j - \sum_{i=0}^{K} AIUR_j^i \tag{3} $$

where AOURj is the area of OURj, K (≥ 0) is the number of holes in URj, and
$AIUR_j^i$ is the area of the i-th hole in URj, i ≤ K. Similarly, for AISj, we
have

$$ AIS_j = \sum_{i=1}^{NS} AS_i = \sum_{i=1}^{NS} \left( AOS_i - \sum_{k=0}^{K} AIS_i^k \right) \tag{4} $$

where NS (≥ 1) is the number of subdivisions in ISj, ASi is the area of the
i-th subdivision, AOSi is the area of the outer ring of Si, K (≥ 0) is the
number of holes in Si, and $AIS_i^k$ is the area of the k-th hole of Si, k ≤ K.

Fig. 9. Query and I/O performance vs. N
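
The quick method can be sketched in a few lines. The representation of a
region as a pair (outer ring, list of hole rings) and all function names are
assumptions made for this illustration. The absolute value makes the result
independent of the vertex order, since the determinant sum of Equation (2) is
positive or negative depending on the orientation of the ring.

```python
def ring_area(ring):
    """Area of a simple polygon ring via the determinant sum of Equation (2)."""
    n = len(ring)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0


def region_area(outer, holes=()):
    """Outer ring area minus the areas of the holes, as in Equation (3)."""
    return ring_area(outer) - sum(ring_area(h) for h in holes)


def quick_probability(ur, is_subdivisions):
    """Pj = AISj / AURj for a uniform pdf.

    ur is (outer, holes); is_subdivisions is a list of (outer, holes) pairs,
    one per subdivision, summed as in Equation (4).
    """
    a_ur = region_area(*ur)
    a_is = sum(region_area(outer, holes) for outer, holes in is_subdivisions)
    return a_is / a_ur
```

For example, a 10 x 10 uncertainty region with one 2 x 2 hole has area 96; if
the IS is a 5 x 10 strip that does not contain the hole, the sketch yields
50/96.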

Monte Carlo Method When the pdf is not uniform, we have to use other methods to
obtain Pj. In general, numerical integration is a natural choice. We adopt the
Monte Carlo (MC) method for this purpose. The pseudocode of the MC method is
illustrated in Figure 10.

Simply speaking, there are 4 steps. (1) We repeatedly generate random points in
the MBR of URj until the number of valid random points equals a pre-set value
(N_seed for convenience). (2) For each randomly generated point, we check
whether it lies in URj. Note that we must take into account whether there are
holes in URj. We take a random point as a valid random point only if it indeed
lies in URj. (3) If it is a valid random point, we evaluate fj(xi, yi) based on
its coordinates and the pdf, and accumulate this value in a variable (SUM1 for
convenience). In addition, we check whether the point also lies in ISj; if so,
we also accumulate the value in another variable (SUM2 for convenience). (4)
When the number of valid random points equals N_seed, we compute SUM2/SUM1 and
assign the value to Pj.



Procedure MonteCarlo ( ) {
Input: URj, ISj, N_seed
Output: Pj

(1) Pj ← 0, SUM1 ← 0, SUM2 ← 0, N1 ← 0
(2) repeat
(3)   Pti ← generate a random point in the MBR of URj
(4)   if (Pti locates in OURj) then
(5)     if (URj.Flag = 1) then sign ← 0
(6)       for each hole in URj
(7)         if (Pti locates in hole) then sign ← 1; break;
(8)       if (sign = 1) then continue; // shift to generate next random point
(9)     SUM1 ← SUM1 + fj(xi, yi); N1 ← N1 + 1
(10)    for each subdivision S ∈ ISj
(11)      if (Pti locates in the outer ring of S) then SUM2 ← SUM2 + fj(xi, yi)
(12) until N1 = N_seed
(13) Pj ← (SUM2/SUM1)
(14) return Pj }

Fig. 10. Pseudocode of MC Method for Computing Pj
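
The four steps can be written as a short runnable sketch. It assumes that a
region is given as a pair (outer ring, list of hole rings), uses a simple
even-odd point-in-polygon test in place of the algorithm of [31], and takes
the pdf as an arbitrary Python callable; all names are illustrative. The
sketch does not guard against a pdf that is zero on every sampled point.

```python
import random


def point_in_ring(pt, ring):
    """Even-odd ray casting test for a simple polygon ring."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside


def in_region(pt, outer, holes):
    """Inside the outer ring and outside every hole."""
    return point_in_ring(pt, outer) and not any(point_in_ring(pt, h) for h in holes)


def monte_carlo(ur, is_subdivisions, pdf, n_seed, rng=None):
    """Estimate Pj as SUM2 / SUM1 over n_seed valid random points."""
    rng = rng or random.Random()
    outer, holes = ur
    xs = [p[0] for p in outer]
    ys = [p[1] for p in outer]
    sum1 = sum2 = 0.0
    n_valid = 0
    while n_valid < n_seed:
        # Step 1: random point in the MBR of the uncertainty region
        pt = (rng.uniform(min(xs), max(xs)), rng.uniform(min(ys), max(ys)))
        # Step 2: keep it only if it really lies in the region
        if not in_region(pt, outer, holes):
            continue
        # Step 3: accumulate the density, and again if the point is in the IS
        w = pdf(pt[0], pt[1])
        sum1 += w
        n_valid += 1
        if any(in_region(pt, o, h) for o, h in is_subdivisions):
            sum2 += w
    # Step 4: ratio of the two sums
    return sum2 / sum1
```

With a constant pdf the estimate approaches the area ratio AISj/AURj, which
gives a simple sanity check against the quick method.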



5 Indexing Schemes 

The proposed approaches work correctly. However, there is still considerable
room for improvement. In Section 4 we use MBRs to determine CMOs and CRAs, but
unrelated moving objects and restricted areas are pruned one by one, which
becomes inefficient as M and/or N increases. Here we adopt the classical R-tree
as the index. Even with the R-tree, there are two possible schemes.

5.1 Immediate indexing 

Since all RAs are static, their MBRs can be obtained easily. In addition, since
we already store the location and distance threshold of each MO in the
database, the MBRs of all MOs can also be computed easily. Therefore, we can
index all RAs and MOs immediately based on their MBRs. Once all RAs and MOs
have been indexed, we can use the two indices to quickly prune unrelated RAs
and MOs, respectively. If a MO reports its new location to the database server,
we update the database record and the location index. Note that there are two
indices: one for the RAs, and the other for the locations of the MOs
(incorporating the distance threshold). We can construct the two indices
concurrently or one after another.
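
As a rough illustration of the rectangles that such a location index would
hold, the sketch below derives an MBR from a reported location and a distance
threshold. It assumes that the MBR is the square enclosing the circle of radius
DT around the reported location, which is an assumption of this example. The
linear scan stands for the one-by-one pruning that an R-tree lookup replaces;
no R-tree is implemented here, and all names are hypothetical.

```python
def mo_mbr(x, y, dt):
    """MBR of a moving object: square around (x, y) with half-width dt (assumed)."""
    return (x - dt, y - dt, x + dt, y + dt)


def overlaps(a, b):
    """True if two MBRs (xmin, ymin, xmax, ymax) share at least one point."""
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])


def scan_candidates(query_mbr, mbrs):
    """One-by-one pruning: indices of the MBRs that overlap the query MBR."""
    return [i for i, m in enumerate(mbrs) if overlaps(m, query_mbr)]
```

An R-tree answers the same overlap query without visiting every entry, which
is where the savings for large M and N come from.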

5.2 Preprocessing based indexing

A conspicuous characteristic of the above scheme is that we have to compute the
UR on the fly. Here we present another indexing scheme: we pre-compute the URs
and index them using an R-tree. We construct the RA index first, then
pre-compute the URs and construct the UR index. While pre-computing the URs, we
can take full advantage of the RA index. In this scheme, a MO may report its
new location to the server while the UR index is being constructed. We
differentiate two cases: (1) the UR of this MO has already been pre-computed
and indexed in the UR index; (2) the UR of this MO has not yet been
pre-computed. Both cases are easy to handle. For the first case, we can
re-compute this UR and update the current UR index immediately, or we can just
store a label for it and, based on the label information, re-compute this UR
and update the UR index after we have pre-computed and indexed all 'old' URs.
Certainly, we also need to update the location record in the database. For the
second case, we only need to update the location record in the database. Note
that in this scheme, i.e., preprocessing based indexing, we also construct two
indices: the RA index and the UR index.

6 Performance study 

In this section, we discuss the settings of our experiments and present the impor- 
tant performance results for PRQ over uncertain moving objects in constrained 
2D space. 

6.1 Experiment Settings 

There are two types of datasets in our experiments: one represents the RAs, and
the other represents the locations of the MOs. We use a number of polygons to
denote the RAs and distribute them uniformly in the 10000 x 10000 2D space.
Furthermore, we use a number of points to denote the locations of the MOs and
distribute them randomly in the 2D space, with the additional constraint that
these points cannot be located in any RA. To simulate MOs with different
characteristics, we randomly generate different distance thresholds for
different MOs.

All code used in our experiments is written in C++. All experiments are
conducted on a laptop with a 2.16GHz dual-core CPU and 1.86GB of memory,
running Windows XP. The page size is fixed to 4096 bytes. The maximum number of
child nodes in the R-tree is fixed to 50. Other parameters are illustrated in
Table 1.

6.2 Results 

We first present the experimental results when the MOs follow a uniform
distribution, followed by our experimental results when the pdf is Distorted
Gaussian (both in Subsection 6.2). Note that, since our problem is set in
constrained 2D space and existing methods cannot support PRQ over uncertain
moving objects in this setting, we do not compare our methods with existing
methods.

Table 2. Parameters Used in Our Experiments

Parameter   Description                      Value
N           number of objects                [1 x 10^4, ..., 5 x 10^4]
M           number of RAs                    [1 x 10^4, ..., 5 x 10^4]
R_size      size of R                        [(100 x 100), ..., (500 x 500)]
RA_edges    edges of each RA                 [4, ..., 64]
N_seed      number of valid random points    [10^2, ..., 10^8]
DTj         distance threshold of Oj         [20, ..., 50]



Uniform Distribution We compare the following schemes: basic framework with
quick method (BQ); basic framework with quick method and immediate indexing
(BQII); basic framework with quick method and preprocessing based indexing
(BQPI).

Impact of N To investigate the impact of N, we fix the other parameters and
vary only the value of N (from 10000 to 50000). Specifically, we use 40 x 10
rectangles as the RAs and set M = 50000. We randomly generate 50 rectangles of
size 500 x 500 as the query inputs, run each test group 10 times, and then
compute the average query time and I/O time of a single query. Figure 9
illustrates the results. Both query time and I/O time increase with N for the
three schemes. The main reason is that more CMOs are located in R when N
increases and the other conditions remain constant. For these additional
objects, we also have to fetch data from the database (incurring more I/O time)
and evaluate the probability (incurring more CPU time). BQII outperforms BQ
(Figures 9(a) and 9(c)) since BQII utilizes the R-tree, which reduces the
number of records fetched from the database (incurring less I/O time) and the
number of comparisons between the MBR of R and the MBRs of the CMOs (incurring
less CPU time). BQPI outperforms BQII (Figures 9(b) and 9(d)) since BQII
evaluates the UR on the fly (incurring more CPU time), for which it has to
fetch RA data from the database (incurring more I/O time). For the rest of the
experimental results, if BQPI outperforms BQII or BQII outperforms BQ and there
is no other reason, we do not repeat the explanation, to save space.

Impact of M To investigate the impact of M, we use a test method similar to
that of the previous paragraph, with a small modification. Specifically, we fix
N = 50000 and vary the value of M (from 10000 to 50000). The results are
illustrated in Figure 11. The size of M has a great impact on BQ, but for BQII
and BQPI the query time is nearly constant when we vary M (Figures 11(a) and
11(b)). In particular, the I/O time for BQPI is also nearly constant (Figure
11(d)). This is because BQPI indexes the MOs and URs directly, so we need not
fetch RA records from the database. Note that the I/O performance of BQPI is
not always better than that of BQII (Figure 11(d)). This is because, when M is
small, the time for fetching a UR record from the database is larger than the
sum of the time for fetching the location record and the time for fetching the
corresponding RA records. Even so, the query performance of BQPI is still
better than that of BQII (Figure 11(b)). This is because BQPI need not compute
the UR on the fly, which yields a big saving in CPU time, and this saving is
larger than the amount by which the I/O time of BQPI exceeds that of BQII (when
it does).

Fig. 11. Query and I/O performance vs. M

Impact of RA_edges We vary the number of edges in each RA (ranging from 4 to
64) to study the impact of RA_edges. Specifically, we let M = 50000 and use
equilateral polygons as the RAs. Each RA has the property that the distance
from its center to each of its vertices is 20. For different test groups, we
vary only the number of edges in each RA. Other parameters not stated here are
the same as in the previous paragraph. Figure 12 illustrates the results. Both
query time and I/O time increase as RA_edges increases for the three schemes.
In terms of BQPI (Figures 12(b) and 12(d)), there are several reasons. First,
the RA itself enlarges as the number of edges increases: since the distance
from its center to its vertices is fixed, an equilateral polygon with more
edges occupies more area. Therefore, as RA_edges increases, some RAs that do
not intersect EPj when RA_edges is small may come to intersect EPj, so we have
to evaluate the subtraction between EPj and these RAs (incurring more CPU
time). Second, as RA_edges increases, the complexity of the UR increases, which
results in more time to fetch the UR record from the database (incurring more
I/O time).

Fig. 12. Query and I/O performance vs. RA_edges
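
The claim that a polygon with more edges occupies more area at a fixed
center-to-vertex distance can be checked with the standard area formula for a
regular n-gon with circumradius r, A = (n/2) r^2 sin(2π/n). The sketch below
assumes the RAs are regular polygons, which is how we read "equilateral
polygon with a fixed center-to-vertex distance"; the function name is
illustrative.

```python
import math


def regular_polygon_area(n_edges, circumradius):
    """Area of a regular polygon with n_edges vertices on a circle of given radius."""
    return 0.5 * n_edges * circumradius ** 2 * math.sin(2 * math.pi / n_edges)


# Areas for the range of RA_edges used in the experiments, with radius 20.
areas = [regular_polygon_area(n, 20) for n in (4, 8, 16, 32, 64)]
```

With r = 20 the area is 800 for a square and grows monotonically towards the
area of the circumscribed circle (400π) as the number of edges increases.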

Impact of R_size Figure 13 depicts the performance results when we vary the
size of R. Here we use 40 x 10 rectangles as the RAs, set both M and N to
50000, and randomly generate 50 rectangles as the query inputs. For different
test groups, we vary the size of R from "100 x 100" to "500 x 500". Other
parameters not stated here are the same as in the previous paragraph. We
observe that, as R_size increases, both the query time and the I/O time
increase for the three schemes. This is because more CMOs are located in R as
R_size increases, so more location records and corresponding RA records, or UR
records, must be fetched from the database (incurring more I/O time). In
addition, for those additional CMOs, we also have to evaluate their
probabilities (incurring more CPU time).



Fig. 13. Query and I/O performance vs. R_size

Distorted Gaussian Distribution In the following experiment, we assume the pdf
is Distorted Gaussian ($pdf_{DG}(x, y)$ for short). The definition of Distorted
Gaussian is derived from the general Gaussian. Given the pdf of the general
Gaussian, $pdf_G(x, y)$, and a coefficient λ, where
$\lambda = \iint_{(x,y) \in UR_j} pdf_G(x, y)\,dx\,dy$, we get $pdf_{DG}(x, y)$
as follows.

$$ pdf_{DG}(x, y) = \begin{cases} pdf_G(x, y) / \lambda, & (x, y) \in UR_j \\ 0, & \text{otherwise} \end{cases} \tag{5} $$

In theory, we should calculate the coefficient λ and translate $pdf_G(x, y)$
into $pdf_{DG}(x, y)$ for each MO. Fortunately, in practice we need neither
calculate it nor do any translation for any MO. The reason is that λ is
eliminated when we substitute $pdf_{DG}(x, y)$ with $pdf_G(x, y)/\lambda$ in
the following formula.

$$ P_j = \left( \sum_{i=1}^{N_2} pdf_{DG}(x_i, y_i) \right) \div \left( \sum_{i=1}^{N_1} pdf_{DG}(x_i, y_i) \right) \tag{6} $$

where N1 and N2 are the numbers of random points located in URj and ISj,
respectively (recall Subsection 4.4). In our experiment, the standard deviation
of $pdf_G(x, y)$ (used for defining $pdf_{DG}(x, y)$) is set to DTj/5, and the
means $u_x$ and $u_y$ are set to $L^{lst}.X$ and $L^{lst}.Y$, respectively.
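
The cancellation of λ in Equation (6) can be checked numerically. The sketch
assumes an isotropic two-dimensional Gaussian for $pdf_G$ and uses an arbitrary
positive constant in place of the true λ; the sample points, the membership
test and all names are made up for the illustration.

```python
import math


def gaussian_pdf(x, y, ux, uy, sigma):
    """Isotropic 2D Gaussian density (assumed form of pdf_G)."""
    d2 = (x - ux) ** 2 + (y - uy) ** 2
    return math.exp(-d2 / (2 * sigma ** 2)) / (2 * math.pi * sigma ** 2)


def ratio(points_in_ur, in_is, pdf):
    """Equation (6): sum over the points in IS divided by the sum over the points in UR."""
    sum1 = sum(pdf(x, y) for x, y in points_in_ur)
    sum2 = sum(pdf(x, y) for x, y in points_in_ur if in_is((x, y)))
    return sum2 / sum1


# Hypothetical valid sample points in UR; IS is taken to be the half-plane x >= 0.
points = [(-3.0, 1.0), (-1.0, -2.0), (0.5, 0.5), (2.0, -1.0), (3.5, 2.5)]
in_is = lambda p: p[0] >= 0

pdf_g = lambda x, y: gaussian_pdf(x, y, 0.0, 0.0, 4.0)
lam = 0.37  # stands in for the unknown normalising coefficient
pdf_dg = lambda x, y: pdf_g(x, y) / lam

p_with_g = ratio(points, in_is, pdf_g)
p_with_dg = ratio(points, in_is, pdf_dg)
```

Both ratios agree up to floating-point rounding, which is why the Monte Carlo
procedure can evaluate $pdf_G$ directly.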

Note that the pdf of the general Gaussian is
$\frac{1}{2\pi\sigma^2} e^{-\frac{(x-u_x)^2 + (y-u_y)^2}{2\sigma^2}}$; the
general Gaussian has an infinite input space, which is symmetric. However, the
input space of the Distorted Gaussian used in our experiment is limited to URj
and may not be symmetric.

Workload Error We investigate the workload error in order to find an
appropriate value for the parameter N_seed. Then, we use the chosen value to
verify the query efficiency under Distorted Gaussian. Two common kinds of
workload error are the relative workload error (RWE) and the absolute workload
error (AWE). We choose AWE to evaluate the efficiency of our proposed
approaches. The main reason is that ensuring a very small RWE takes much more
time than ensuring an AWE of the same size. In most cases, a small AWE is
enough to satisfy our demand. We validate this by the following method.

We generate an object Oj with DTj = 20 at the centre of the 2D space and
compute URj. Next, we generate 100 query ranges that have the same size (500 x
500 rectangles) and have different intersections with this URj. In the first
run, we get the real answer of each query by setting N_seed = 10e+8. Next, we
vary N_seed to get several groups of AWEs and RWEs. When N_seed = 700, the AWE
is 0.95% (i.e., 0.0095), and the RWE is 10.75%, as shown in Figure 14(a).
Obviously, it is unreasonable to accept 10.75% as the RWE, since this would
mean that returning a value of 89.25% is tolerated even if the real value is
100%. Therefore, in order to get a smaller RWE, we have to increase N_seed. By
doing so, we get RWE = 1.12% and AWE = 0.05% at N_seed = 50000. Therefore, to
assure an RWE of 1%, we have to set N_seed > 50000 at least. However, even if
we choose N_seed = 50000, evaluating the probability of a single object takes
about 2871 milliseconds. In view of this, we choose AWE to assess the returned
results.

In addition, we vary the value of DTj to verify the impact of DTj on AWE (using
a method similar to the above). Figure 14(b) shows that, on the whole, an
object with a smaller DTj usually needs a larger N_seed to assure the same AWE.
Based on this fact, we choose N_seed = 700 in our next experiment, which
ensures an AWE of about 0.01.
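
The difference between the two error measures is easy to see on a single
value. The sketch assumes the usual definitions, AWE = |estimated − real| and
RWE = |estimated − real| / real; the probabilities in the example are invented
for illustration and are not measurements from our experiments.

```python
def awe(estimated, real):
    """Absolute workload error for one query (assumed definition)."""
    return abs(estimated - real)


def rwe(estimated, real):
    """Relative workload error for one query (assumed definition; real must be non-zero)."""
    return abs(estimated - real) / real


# Hypothetical query whose true probability is small.
real_p, estimated_p = 0.08, 0.0714
example_awe = awe(estimated_p, real_p)
example_rwe = rwe(estimated_p, real_p)
```

Here the absolute error is below 0.01 while the relative error is above 10%,
which mirrors the gap between AWE and RWE observed at N_seed = 700: queries
with a small true probability dominate the RWE.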

Note that RWE = |estimated value − real value| / real value, and AWE =
|estimated value − real value|. The real answer used above is in fact a value
very close to the real value rather than the exact real value.

Query Efficiency When the pdf is Distorted Gaussian, we compare the following
schemes: basic framework with Monte Carlo method and immediate indexing (BMII);
basic framework with Monte Carlo method and preprocessing based indexing
(BMPI). Since the Monte Carlo method is only relevant to the evaluation of
probability, the I/O efficiency is the same as that under uniform distribution,
so we compare only their query efficiency. Here we present the performance
result when we vary R_size. Other parameters (not stated here) are the same as
those used when investigating the impact of R_size under uniform distribution.
Figure 15(a) illustrates our result. As expected, the query time increases with
R_size for the two schemes, and BMPI outperforms BMII. The reason is similar to
that discussed in Subsection 6.2. In addition, we compare the performance of
BMPI and BQPI, with all parameters the same except the pdf. Figure 15(b)
depicts the result. We observe that the query time of the former is far greater
than that of the latter. This is because the time for evaluating a single
object's probability is relatively long when the pdf is Distorted Gaussian.

Fig. 15. Query Efficiency Comparison



7 Related Works 

In terms of range queries over uncertain moving objects, researchers have made
considerable efforts [19,20,12,21,22,16,32,23,24,13,33,34,35,18,17,25,14,36].
For example, Sistla et al. [20,23] proposed the MOST model for efficiently
representing the location of moving objects, and a temporal query language,
FTL, for providing a more intuitive and simple query language. However, their
approach provides only a qualitative answer, i.e., whether a certain object may
or must be located in a certain region at some time, rather than how large the
probability is. Trajcevski et al. [24] proposed to model an uncertain
trajectory as a 3D (2D for space, 1D for time) cylindrical body; their
approaches also give only a qualitative answer.

Wolfson et al. [2] presented a probabilistic method for processing range
queries. They assumed that all the objects were travelling on routes, so their
research belongs to "coarse-grained" query. Zheng et al. [25] represented the
uncertainty of objects moving along road networks as time-dependent probability
distribution functions, and proposed an indexing mechanism, UTH, for
efficiently managing and retrieving massive trajectories. Their research also
belongs to "coarse-grained" query.

Pfoser et al. [12] proposed to represent the uncertain region of a moving
object as a circle, and to represent the historical trajectories of moving
objects as polylines. They focused on querying the past locations of moving
objects during an interval of time. Recently, Zhang et al. [17] devised an
inference method for the prediction of future locations.

Cheng et al. [13] characterized the uncertainty of a moving object as a closed
region with a probability function, and adopted VCI to manage the moving
objects. Tao et al. [35] investigated the probability threshold range query.
They proposed probabilistic constrained rectangles (PCR), which help to prune
or validate a majority of nonqualifying or qualifying objects, and an indexing
technique, U-tree, for indexing uncertain objects and reducing I/O cost. Chen
et al. [16] addressed the location based range query, i.e., the location of the
query issuer is not static. However, to the best of our knowledge, there is no
prior work addressing PRQ over uncertain moving objects in constrained 2D
space.

8 Conclusion 

PRQ over uncertain moving objects is attracting extensive attention. In this
paper, we make the more realistic assumption that objects move in constrained
2D space. We discuss techniques for representing the different entities and
present a framework for query processing. By carefully analysing the
geometric-topological relations between different entities, we design
algorithms for computing the UR and IS. To obtain the probability, we discuss
two methods, the quick method and the Monte Carlo method, which are used for
different pdfs. Moreover, we introduce two R-tree based indexing schemes to
further improve the efficiency. We demonstrate through extensive experiments
the correctness and effectiveness of our proposed approaches. In future work,
we will study other typical queries (such as KNN) over objects moving in
constrained 2D space; these queries are more interesting and challenging.